Abstract

We investigate the problem of Bayesian updating of a probability 
distribution encoded in the quantum state of n qubits. The updating 
procedure takes the form of a quantum algorithm that prepares the 
quantum register in the state representing the posterior distribution. 
Depending on how the prior distribution is given, we describe two 
implementations, one probabilistic and one deterministic, of such an 
algorithm in the standard model of a quantum computer. 

1 Introduction 

Bayes's rule provides a simple and fundamental mechanism for updating
a probability distribution in the light of new data [1]. The rule takes its
simplest form for a finite sample space, $H$, where the elements $h \in H$ can be
identified with the atomic events, or hypotheses. Let $P_{\rm prior}(h) = P(h)$ be the
prior probability distribution, and assume some piece of data, $d$, is observed.
If $P(d|h)$ is the conditional probability of $d$, given $h$, Bayesian updating
consists of replacing the prior with the posterior distribution,
$P_{\rm posterior}(h) = P(h|d)$, where

$$P(h|d) = \frac{P(d|h)\,P(h)}{P(d)} , \qquad P(d) = \sum_{h \in H} P(d|h)\,P(h) . \tag{1}$$

To simplify the notation, we assume from now on that the set of hypotheses
is of the form $H = \{0, \ldots, 2^n - 1\}$ for some positive integer $n$. For $h \in H$,
let $|h\rangle$ denote the computational basis states of a register of $n$ qubits. The
state

$$|\Psi_{\rm prior}\rangle = \sum_{h \in H} \sqrt{P(h)}\, |h\rangle \tag{2}$$

provides an encoding of the prior on the quantum register. Even though the
size of the sample space grows exponentially with the number of qubits, $n$,
there exists an interesting class of priors for which $|\Psi_{\rm prior}\rangle$ can be prepared
efficiently, in the sense that the required computational resources grow only
polynomially with $n$ [2, 3].
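As a purely classical illustration of the objects defined so far, the following minimal sketch applies Bayes's rule to a small hypothesis set and lists the amplitudes $\sqrt{P(h)}$ that the register state would carry. The function names `bayes_update` and `encode`, and the numerical values, are illustrative assumptions and do not come from the paper.

```python
import math

def bayes_update(prior, likelihood):
    """Return (posterior, P(d)) for a finite hypothesis set, following Bayes's rule."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    p_d = sum(joint)  # P(d) = sum_h P(d|h) P(h)
    if p_d == 0.0:
        raise ValueError("the data have zero probability under the prior")
    return [j / p_d for j in joint], p_d

def encode(distribution):
    """Amplitudes sqrt(P(h)) of the register state in the computational basis."""
    return [math.sqrt(p) for p in distribution]

# Illustrative numbers only: n = 2 qubits, hypotheses h = 0, 1, 2, 3.
prior = [0.1, 0.2, 0.3, 0.4]
likelihood = [0.5, 0.25, 0.25, 0.0]  # P(d|h) for the observed d
posterior, p_d = bayes_update(prior, likelihood)
psi_prior = encode(prior)          # amplitudes of the prior state
psi_posterior = encode(posterior)  # amplitudes of the target posterior state
```

Storing all $2^n$ amplitudes explicitly is of course only feasible for small $n$; the sketch is meant to fix notation, not to model the efficient state preparation mentioned above.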

To formulate the problem of Bayesian updating for a prior encoded on a
quantum register, we make the assumption that we have a classical algorithm
that computes, as a function of $h$, the conditional probability $P(d|h)$ for the
observed data $d$. Given this classical algorithm, the goal of Bayesian updating
is then to prepare the register in the state

$$|\Psi_{\rm posterior}\rangle = \sum_{h \in H} \sqrt{P(h|d)}\, |h\rangle , \tag{3}$$

with $P(h|d)$ given by Eq. (1). If the prior is given to us in the form of
a single copy of the state $|\Psi_{\rm prior}\rangle$, our problem is equivalent to finding a
quantum operation, $M_d$, that maps any prior state of the form (2) into the
corresponding posterior state of the form (3),

$$M_d |\Psi_{\rm prior}\rangle = |\Psi_{\rm posterior}\rangle . \tag{4}$$

It is easy to see that $M_d$ cannot in general be a trace-preserving map. For
example, consider the two prior states

$$|\Psi^{(1)}_{\rm prior}\rangle = \tfrac{1}{\sqrt{2}}\big(|1\rangle + |2\rangle\big) , \qquad |\Psi^{(2)}_{\rm prior}\rangle = \tfrac{1}{\sqrt{2}}\big(|2\rangle + |3\rangle\big) , \tag{5}$$

corresponding to two different prior probability distributions, and assume
that the conditional probability distribution is given by

$$P(d|h) = \begin{cases} 0 & \text{if } h = 2 , \\ c & \text{otherwise} , \end{cases} \tag{6}$$

where $c$ is a constant determined by normalization. Although the prior states
are nonorthogonal, we obtain mutually orthogonal posterior states

$$|\Psi^{(1)}_{\rm posterior}\rangle = M_d |\Psi^{(1)}_{\rm prior}\rangle = |1\rangle , \qquad |\Psi^{(2)}_{\rm posterior}\rangle = M_d |\Psi^{(2)}_{\rm prior}\rangle = |3\rangle , \tag{7}$$

which implies that $M_d$ is trace-decreasing. Bayesian updating of a single
copy of $|\Psi_{\rm prior}\rangle$ is therefore generally probabilistic. Section 2 of this paper
discusses probabilistic Bayesian updating.
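The counterexample can be checked numerically. The sketch below assumes a likelihood that vanishes on $h = 2$ and equals an arbitrary constant `c` elsewhere, and computes the overlap of the two prior states and of the two normalized posterior states. The overlap drops from 1/2 to 0, which no trace-preserving operation can achieve because such operations cannot make nonorthogonal states orthogonal. All names are illustrative.

```python
import math

def inner(u, v):
    """Inner product of two real amplitude vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(inner(v, v))
    return [a / norm for a in v]

def update(state, likelihood):
    """Multiply each amplitude by sqrt(P(d|h)) and renormalize."""
    return normalize([a * math.sqrt(l) for a, l in zip(state, likelihood)])

# Basis order |0>, |1>, |2>, |3>.
prior_a = normalize([0.0, 1.0, 1.0, 0.0])   # (|1> + |2>)/sqrt(2)
prior_b = normalize([0.0, 0.0, 1.0, 1.0])   # (|2> + |3>)/sqrt(2)

c = 0.5                                     # arbitrary nonzero constant (assumption)
likelihood = [c, c, 0.0, c]                 # P(d|h) = 0 for h = 2

post_a = update(prior_a, likelihood)        # equals |1>
post_b = update(prior_b, likelihood)        # equals |3>

overlap_prior = inner(prior_a, prior_b)
overlap_posterior = inner(post_a, post_b)
```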

A deterministic updating scheme is possible, however, if the prior is given
in the form of a unitary quantum circuit that maps a standard state, assumed
for simplicity to be the computational basis state $|0\rangle$, to $|\Psi_{\rm prior}\rangle$.
Deterministic updating is the topic of Section 3.

2 Probabilistic algorithms 

As we have shown above, there is in general no trace-preserving quantum
operation that can transform all prior states into the corresponding posterior
states. To realize probabilistic Bayesian updating, we proceed as follows.
Define

$$E_1 = C \sum_{h \in S_{\rm pr}} \sqrt{P(d|h)}\, |h\rangle\langle h| , \tag{8}$$

where $C$ is a constant and $S_{\rm pr}$ is a set containing the support of the prior
probability distribution. We see that

$$E_1 |\Psi_{\rm prior}\rangle \propto |\Psi_{\rm posterior}\rangle . \tag{9}$$

For sufficiently small $|C|$, see Eq. (15) below, one can view $E_1$ as an element
of a trace-preserving quantum operation $\mathcal{E}$ defined, for arbitrary $\rho$, by

$$\mathcal{E}(\rho) = \sum_{k=0}^{1} E_k \rho E_k^\dagger = \sum_{k=0}^{1} p_k\, \rho(k) , \tag{10}$$

where

$$p_k = {\rm Tr}\big(E_k \rho E_k^\dagger\big) \quad \text{and} \quad \rho(k) = E_k \rho E_k^\dagger / p_k . \tag{11}$$

This decomposition shows that the operation $\mathcal{E}$ can be realized as a
measurement with outcomes $k = 0, 1$, where each outcome $k$ happens with
probability $p_k$ and the corresponding conditional density matrix is $\rho(k)$.
Substituting $\rho = |\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}|$ we see that the measurement outcome 1
corresponds to successful Bayesian updating. This happens with probability

$$p_1 = \langle\Psi_{\rm prior}| E_1^\dagger E_1 |\Psi_{\rm prior}\rangle = C^2 \sum_h P(h)\,P(d|h) = C^2 P(d) . \tag{12}$$

In order to obtain a bound on $C$, we note that

$$E_0^\dagger E_0 = 1 - E_1^\dagger E_1 = 1 - C^2 \sum_{h \in S_{\rm pr}} P(d|h)\, |h\rangle\langle h| . \tag{13}$$

Using the positivity of $E_0^\dagger E_0$, we find

$$C^2 \le \Big( \sum_{h \in S_{\rm pr}} P(d|h)\, |\langle v|h\rangle|^2 \Big)^{-1} \tag{14}$$

for any normalized vector $|v\rangle$.

Now let $h^*$ be such that $P(d|h^*) = \max_{h \in S_{\rm pr}} P(d|h)$. Since the above
condition is valid for any $|v\rangle$, one can choose $|v\rangle = |h^*\rangle$ and obtain

$$C^2 \le 1 / \max_{h \in S_{\rm pr}} P(d|h) . \tag{15}$$

Together with Eq. (12) this gives an upper bound on the success probability
of Bayesian updating,

$$p_1 \le \frac{P(d)}{\max_{h \in S_{\rm pr}} P(d|h)} . \tag{16}$$

In the next subsection we describe an explicit algorithm that achieves this
bound.
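The two-outcome measurement can be sketched numerically when both measurement operators are taken to be diagonal in the computational basis. This is an assumption of the sketch: the text fixes only $E_0^\dagger E_0$, and the choice $E_0 = \sum_h \sqrt{1 - C^2 P(d|h)}\,|h\rangle\langle h|$ is one convenient representative. The names `kraus_diagonals` and `outcome_probabilities` are illustrative, and the example prior has full support, so the maximum is taken over all $h$.

```python
import math

def kraus_diagonals(likelihood, C):
    """Diagonal entries of E_0 and E_1 in the computational basis."""
    e1 = [C * math.sqrt(l) for l in likelihood]
    e0 = []
    for x in e1:
        rest = 1.0 - x * x  # diagonal entry of E_0^dagger E_0
        if rest < -1e-12:
            raise ValueError("C is too large: E_0^dagger E_0 would not be positive")
        e0.append(math.sqrt(max(rest, 0.0)))
    return e0, e1

def outcome_probabilities(prior, likelihood, C):
    """Probabilities (p_0, p_1) of the two measurement outcomes for the prior state."""
    e0, e1 = kraus_diagonals(likelihood, C)
    p0 = sum(p * x * x for p, x in zip(prior, e0))
    p1 = sum(p * x * x for p, x in zip(prior, e1))
    return p0, p1

# Illustrative numbers only.
prior = [0.1, 0.2, 0.3, 0.4]
likelihood = [0.5, 0.25, 0.25, 0.0]
p_d = sum(p * l for p, l in zip(prior, likelihood))

C_max = 1.0 / math.sqrt(max(likelihood))   # largest C allowed by the positivity condition
p0, p1 = outcome_probabilities(prior, likelihood, C_max)
bound = p_d / max(likelihood)              # upper bound on the success probability
```

With the largest admissible $C$, the success probability `p1` coincides with `bound`; any larger value of `C` makes `kraus_diagonals` raise an error because positivity is violated.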



2.1 Explicit algorithm 

The operation $\mathcal{E}$ can be realized as a modification of a procedure proposed
by Rudolph [4] as follows. First we prepare the product of the prior state
and an auxiliary qubit state, $|\Psi_{\rm prior}\rangle|0\rangle$. Then, using the classical algorithm
for computing $P(d|h)$, one can construct a quantum circuit $U_d$ that performs
a conditional rotation of an auxiliary qubit so that

$$U_d |\Psi_{\rm prior}\rangle|0\rangle = \sum_h \sqrt{P(h)}\, |h\rangle \big( A_1(h)|0\rangle + B_1(h)|1\rangle \big) , \tag{17}$$

where

$$A_1(h) = c_1 \sqrt{P(d|h)} , \qquad B_1^2 = 1 - A_1^2 = 1 - c_1^2 P(d|h) , \tag{18}$$

and $c_1$ is a constant. Then, measuring the auxiliary qubit, we obtain the
desired state $|\Psi_{\rm posterior}\rangle|0\rangle$ with probability

$$p_1 = c_1^2 \sum_h P(h)\,P(d|h) = c_1^2 P(d) . \tag{19}$$

Comparing Eqs. (12) and (19), we see that $c_1$ plays the role of $C$; by Eq. (15)
we can therefore set $c_1^2 = 1/\max_{h \in S_{\rm pr}} P(d|h)$. With this setting, $p_1$ achieves
the theoretical bound on the success probability, Eq. (16).

In the above algorithm, one can safely achieve the maximal success
probability only if the value of $\max_{h \in S_{\rm pr}} P(d|h)$ is known. The lack of such
knowledge does not prevent us from using the above algorithm, however, since
we can always use the trivial setting $c_1^2 = 1$. The price to pay is a smaller
success probability.

An intermediate situation occurs if a nontrivial upper bound on $P(d|h)$
is known, i.e., a constant $M$ such that $\max_{h \in S_{\rm pr}} P(d|h) \le M < 1$. One can
then set $c_1^2 = 1/M$, which improves the success probability compared to the
trivial setting.
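The effect of the three choices of $c_1^2$ discussed above (trivial, bounded by some $M$, and optimal) can be compared in a small state-vector sketch. The code tracks, for each $h$, the pair of amplitudes on the auxiliary-qubit states $|0\rangle$ and $|1\rangle$. The function names and the numbers, including the bound $M = 0.8$, are illustrative assumptions.

```python
import math

def rotate_ancilla(prior, likelihood, c1_sq):
    """Amplitudes after the conditional rotation, as (ancilla 0, ancilla 1) pairs per h."""
    state = []
    for p, l in zip(prior, likelihood):
        a_sq = c1_sq * l                       # A_1(h)^2 = c_1^2 P(d|h)
        if a_sq > 1.0 + 1e-12:
            raise ValueError("c_1^2 exceeds 1 / max P(d|h)")
        a = math.sqrt(min(a_sq, 1.0))
        b = math.sqrt(max(1.0 - a_sq, 0.0))    # B_1(h)
        state.append((math.sqrt(p) * a, math.sqrt(p) * b))
    return state

def measure_zero(state):
    """Probability of finding the ancilla in |0> and the normalized register state."""
    p1 = sum(a0 * a0 for a0, _ in state)
    if p1 == 0.0:
        raise ValueError("outcome 0 never occurs")
    return p1, [a0 / math.sqrt(p1) for a0, _ in state]

prior = [0.1, 0.2, 0.3, 0.4]
likelihood = [0.5, 0.25, 0.25, 0.0]
settings = {"trivial": 1.0, "bound M": 1.0 / 0.8, "optimal": 1.0 / max(likelihood)}

success = {}
for label, c1_sq in settings.items():
    success[label], register = measure_zero(rotate_ancilla(prior, likelihood, c1_sq))
```

In every setting the register is left with amplitudes $\sqrt{P(h|d)}$ after a successful measurement; only the probability of success changes.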

2.2 Iterative algorithm 

Let $M_1$ be an upper bound on $\max_{h \in S_{\rm pr}} P(d|h)$. Imagine that at the beginning
we do not have enough information about $P(d|h)$ and $P(h)$ to calculate a
nontrivial value for $M_1$. In other words, we have to assume that $M_1 = 1$.
Imagine also that we expect to acquire a better bound $M_2 < M_1$ in the future.
We will now address the following question: Can we run the probabilistic
algorithm of Sec. 2.1 first with the trivial bound $M_1 = 1$, and later with the
improved bound $M_2$, without reducing the overall success probability that
can be achieved by running the algorithm once with the bound $M_2$? We will
find that this is indeed the case. This result remains true for a sequence of
bounds, $M_k < M_{k-1} < \cdots < M_1$. Below we describe an iterative version
of the above algorithm that makes use of better bounds as they become
available.

Consider the measurement part of the algorithm of Sec. 2.1. If the
measurement fails, which happens with probability $1 - p_1$, we end up with the
state

$$|\psi_1\rangle = \Big( N_1 \sum_h \sqrt{P(h)}\, B_1(h)\, |h\rangle \Big) |1\rangle , \qquad N_1^{-2} = 1 - c_1^2 P(d) , \tag{20}$$

where we might have set $c_1^2 = 1/M_1$ to maximize $p_1$. Since we know the exact
form of $|\psi_1\rangle$ we may attempt to achieve our original goal by performing a
transformation

$$|\psi_1\rangle \to N_1 \sum_h \sqrt{P(h)}\, B_1(h)\, |h\rangle \Big( A_2(h)|0\rangle + \sqrt{1 - A_2^2(h)}\, |1\rangle \Big) , \tag{21}$$

where we set

$$A_2(h) = c_2 \frac{\sqrt{P(d|h)}}{B_1(h)} , \qquad B_2^2 = (1 - A_2^2) B_1^2 = B_1^2 - c_2^2 P(d|h) , \tag{22}$$

and $c_2$ is a constant. First of all, it is important to note that this procedure
should not be attempted when $c_1^2$ was set to $1/M_1$, and $M_1$ is still the best
available bound. This is because in the worst case there will be at least
one hypothesis $h^*$ which is present in the sum in Eq. (21) with $B_1(h^*) = 0$ and
$A_2(h^*) > 1$. It follows that the above procedure should only be applied if a
better bound $M_2 < M_1$ became available (or when $c_1^2 < 1/M_1$). In this case,
measurement of the auxiliary qubit yields the desired state $|\Psi_{\rm posterior}\rangle|0\rangle$ with
probability

$$p_2 = N_1^2 c_2^2 \sum_h P(h)\,P(d|h) = \frac{c_2^2 P(d)}{1 - c_1^2 P(d)} . \tag{23}$$

Alternatively, with probability $1 - p_2$, we may end up with the state

$$|\psi_2\rangle = \Big( N_2 \sum_h \sqrt{P(h)}\, B_2(h)\, |h\rangle \Big) |1\rangle . \tag{24}$$

This state is similar in structure to the state $|\psi_1\rangle$, so we may try to recover
in the same way by performing the transformation

$$|\psi_2\rangle \to N_2 \sum_h \sqrt{P(h)}\, B_2(h)\, |h\rangle \Big( A_3(h)|0\rangle + \sqrt{1 - A_3^2(h)}\, |1\rangle \Big) , \tag{25}$$

followed by the measurement of the auxiliary qubit in complete analogy with
our earlier analysis. By continuing this procedure we obtain the sequence of
success probabilities $p_1, p_2, \ldots$ together with the coefficients $\{A_k^2\}$ and $\{B_k^2\}$.
We have

$$A_k(h) = c_k \frac{\sqrt{P(d|h)}}{B_{k-1}(h)} , \qquad B_k^2 = B_{k-1}^2 - c_k^2 P(d|h) , \tag{26}$$

and

$$p_k = \frac{c_k^2 P(d)}{\langle B_{k-1}^2 \rangle} , \tag{27}$$

where $B_{-1}^2 = B_0^2 = 1$, $c_0 = 0$, and

$$\langle B_k^2 \rangle = \sum_h P(h)\, B_k^2(h) . \tag{28}$$

The constants $\{c_k\}$ are the only free parameters in this algorithm. As we
have seen in the case $k = 1$, the constants $\{c_k\}$ cannot be chosen arbitrarily, and
the optimal choice for them depends on the sequence $\{M_k\}$. From Eq. (26)
we obtain

$$B_k^2 = 1 - P(d|h) \sum_{s=1}^{k} c_s^2 \ge 0 , \tag{29}$$

and therefore

$$\sum_{s=1}^{k} c_s^2 \le \frac{1}{P(d|h)} . \tag{30}$$

This condition must be satisfied for all $h$ in the support of the prior and so
we have

$$\sum_{s=1}^{k} c_s^2 \le \frac{1}{\max_{h \in S_{\rm pr}} P(d|h)} . \tag{31}$$

From Eqs. (28) and (29) we compute

$$\langle B_{k-1}^2 \rangle = 1 - P(d) \sum_{s=1}^{k-1} c_s^2 . \tag{32}$$

Together with Eq. (27), this implies

$$p_k = \frac{c_k^2 P(d)}{1 - P(d) \sum_{s=1}^{k-1} c_s^2} . \tag{33}$$

The probability that the algorithm is not successful after the $n$th stage is
given by

$$P^{n}_{\rm fail} = \prod_{k=1}^{n} (1 - p_k) = 1 - P(d) \sum_{s=1}^{n} c_s^2 , \tag{34}$$

which gives the corresponding success probability

$$P^{n}_{\rm succ} = 1 - P^{n}_{\rm fail} = P(d) \sum_{s=1}^{n} c_s^2 \le \frac{P(d)}{\max_{h \in S_{\rm pr}} P(d|h)} , \tag{35}$$

where we used the inequality (31). We see that the theoretical bound for the
overall success probability of transforming one copy of the prior state $|\Psi_{\rm prior}\rangle$
into one copy of the posterior state $|\Psi_{\rm posterior}\rangle$ is achieved for as long as at
some stage $n$ of the algorithm we have

$$\sum_{s=1}^{n} c_s^2 = \frac{1}{\max_{h \in S_{\rm pr}} P(d|h)} . \tag{36}$$

Given the sequence of upper bounds $M_1 > M_2 > \cdots > M_k$, and assuming
that the information in the first $k - 1$ of them was already used without
success, the optimal value of $c_k$ for the next iteration of the algorithm, which
takes into account the bound $M_k$, can be calculated as

$$c_k^2 = \frac{1}{M_k} - \sum_{s=1}^{k-1} c_s^2 . \tag{37}$$
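The claim that iterating with successively better bounds costs nothing in overall success probability can be checked with a few lines of arithmetic. The sketch below assumes that each stage uses the largest constant compatible with the current bound, $c_k^2 = 1/M_k - \sum_{s<k} c_s^2$, and that the stage success probability is $p_k = c_k^2 P(d) / \big(1 - P(d)\sum_{s<k} c_s^2\big)$. The function name `iterate_bounds` and the numbers are illustrative.

```python
def iterate_bounds(p_d, bounds):
    """Stage success probabilities and overall success probability for bounds M_1 > M_2 > ..."""
    used = 0.0          # running sum of c_s^2 over the stages already attempted
    stage_probs = []
    for m in bounds:
        c_sq = 1.0 / m - used            # largest c_k^2 with sum_{s<=k} c_s^2 <= 1/M_k
        if c_sq < 0.0:
            raise ValueError("bounds must not increase")
        remaining = 1.0 - p_d * used     # probability that all earlier stages failed
        if remaining <= 0.0:
            raise ValueError("an earlier stage already succeeds with certainty")
        stage_probs.append(c_sq * p_d / remaining)
        used += c_sq
    failure = 1.0
    for p in stage_probs:
        failure *= 1.0 - p
    return stage_probs, 1.0 - failure

# Illustrative numbers: P(d) = 0.175, true maximum of P(d|h) equal to 0.5.
p_d = 0.175
stage_probs, total = iterate_bounds(p_d, [1.0, 0.8, 0.5])
single_shot = p_d / 0.5   # running once with the final bound
```

Here `total` agrees with `single_shot`: three stages with the bounds 1, 0.8, and 0.5 succeed overall exactly as often as a single run that uses the final bound from the start.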



3 Deterministic updating 

In this section we will assume that the prior is given in the form of a unitary
quantum circuit, $U$, that maps the computational basis state $|0\rangle$ to the prior
state. Apart from the constraint $U|0\rangle = |\Psi_{\rm prior}\rangle$, $U$ is arbitrary. We first give
an algorithm for the special case of hypothesis elimination and then show
how to extend it to two-valued and more general models.

3.1 Hypothesis elimination 

Imagine the situation where each piece of data $d$ partitions the set of
hypotheses $H$ into two subsets: $H_d$, containing all hypotheses that are consistent
with $d$, and $H \setminus H_d$, containing all hypotheses that are rejected by the data $d$.
This leads to a special case of Bayesian updating where $P(d|h)$ takes only two
different values,

$$P(d|h) = \begin{cases} 1/|H_d| & \text{if } h \in H_d , \\ 0 & \text{otherwise} , \end{cases} \tag{38}$$

where $|H_d|$ is the number of hypotheses that are consistent with the data $d$.
The posterior state takes the simple form

$$|\Psi_{\rm posterior}\rangle = N \sum_{h \in H_d} \sqrt{P(h)}\, |h\rangle , \tag{39}$$

where $N$ is the normalization factor.

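Using the given classical algorithm for computing $P(d|h)$, we define a
quantum oracle, $O_d$, as

$$O_d |h\rangle = \begin{cases} -|h\rangle & \text{if } h \in H_d , \\ |h\rangle & \text{otherwise} . \end{cases} \tag{40}$$

Furthermore, let $\Pi$ be a conditional phase shift defined by

$$\Pi |h\rangle = \begin{cases} |h\rangle & \text{if } h = 0 , \\ -|h\rangle & \text{otherwise} . \end{cases} \tag{41}$$

These operations are combined with $U$ to form an operation, $A$, defined by

$$A = U \Pi U^{-1} O_d . \tag{42}$$

The circuit for $A$ is the basic block of the quantum algorithm to prepare
$|\Psi_{\rm posterior}\rangle$.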

It will be convenient to rewrite the prior state (2) in the form

$$|\Psi_{\rm prior}\rangle = \sin\frac{\vartheta}{2}\, |\alpha\rangle + \cos\frac{\vartheta}{2}\, |\beta\rangle , \tag{43}$$

where

$$|\alpha\rangle = \frac{1}{\sqrt{S_{H_d}}} \sum_{h \in H_d} \sqrt{P(h)}\, |h\rangle , \qquad S_{H_d} = \sum_{h \in H_d} P(h) , \tag{44}$$

$$|\beta\rangle = \frac{1}{\sqrt{1 - S_{H_d}}} \sum_{h \notin H_d} \sqrt{P(h)}\, |h\rangle , \tag{45}$$

and

$$\sin\frac{\vartheta}{2} = \sqrt{S_{H_d}} . \tag{46}$$

The last equation shows that knowing the total prior probability of the
hypotheses that are consistent with the data $d$ is equivalent to knowing the
value of $\vartheta$.

It can now be shown that repeated application of the circuit $A$ takes
$|\Psi_{\rm prior}\rangle$ through the sequence of states

$$A^t |\Psi_{\rm prior}\rangle = \sin\Big(\frac{2t+1}{2}\vartheta\Big) |\alpha\rangle + \cos\Big(\frac{2t+1}{2}\vartheta\Big) |\beta\rangle . \tag{47}$$

The number of applications of $A$, $T$, that achieves the required
transformation,

$$|\Psi_{\rm prior}\rangle \to A^T |\Psi_{\rm prior}\rangle = |\Psi_{\rm posterior}\rangle , \tag{48}$$

is therefore

$$T = (\pi/\vartheta - 1)/2 . \tag{49}$$

If $T$ is not an integer, there are two possibilities. Either one uses the closest
integer approximation to $T$ and includes the effect of the noninteger part in
the fidelity analysis (see below), or one follows $\lfloor T \rfloor$ applications of $A$ with
one application of a modified version of $A$ where phases are shifted by less
than $e^{i\pi}$ in both $O_d$ and $\Pi$ [7].
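The rotation produced by repeated application of $A$ can be reproduced in a small state-vector sketch. Two conventions are assumed here and are not taken verbatim from the text: the oracle flips the sign of the amplitudes on $H_d$, and the combination $U \Pi U^{-1}$ acts as the reflection $2|\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}| - 1$. The example prior is chosen so that the total prior probability on $H_d$ is 1/4, which makes $T = 1$ an integer. All names are illustrative.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply_A(state, psi_prior, consistent):
    """One application of A: oracle sign flip on H_d, then reflection about the prior state."""
    flipped = [-a if h in consistent else a for h, a in enumerate(state)]
    overlap = inner(psi_prior, flipped)
    return [2.0 * overlap * p - f for p, f in zip(psi_prior, flipped)]

# Illustrative numbers: prior probability 0.25 on H_d = {0, 1}.
prior = [0.1, 0.15, 0.3, 0.45]
consistent = {0, 1}
psi_prior = [math.sqrt(p) for p in prior]

s_hd = sum(prior[h] for h in consistent)      # total prior probability on H_d
theta = 2.0 * math.asin(math.sqrt(s_hd))      # sin(theta/2) = sqrt(S_{H_d})
T = (math.pi / theta - 1.0) / 2.0             # number of applications of A

state = psi_prior
for _ in range(round(T)):
    state = apply_A(state, psi_prior, consistent)

# Target: prior restricted to H_d and renormalized.
psi_posterior = [math.sqrt(prior[h] / s_hd) if h in consistent else 0.0
                 for h in range(len(prior))]
```

After `round(T)` applications the simulated state coincides with `psi_posterior` up to rounding error. For priors where $T$ is not an integer, the rounding step introduces the fidelity loss discussed in the text.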

In order to compute the number of iterations, $T$, the value of $\vartheta$ must be
known. To obtain $\vartheta$, a version of the standard phase estimation algorithm
[8] can be used as illustrated in Figure 1.
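The size of the phase-estimation register follows from the condition quoted in the caption of Figure 1. The helper below evaluates it, assuming that the logarithm is base 2 and that the brackets denote rounding up; both are assumptions of this sketch, as is the function name `phase_register_size`.

```python
import math

def phase_register_size(m, eps):
    """Number of qubits t for m bits of accuracy and failure probability at most eps."""
    if m < 1 or not 0.0 < eps < 1.0:
        raise ValueError("need m >= 1 and 0 < eps < 1")
    return m + math.ceil(math.log2(2.0 + 1.0 / (2.0 * eps)))

t_example = phase_register_size(8, 0.05)   # 8 bits of accuracy, 5% failure probability
```

The overhead beyond $m$ grows only logarithmically in $1/\epsilon$, so demanding a much smaller failure probability adds just a few qubits.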

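To calculate the effect of an error in the value of $\vartheta$ on the fidelity of the
Bayesian transformation (48), we assume that there is an upper bound on
the absolute error,

$$\Delta\vartheta \ge |\tilde{\vartheta} - \vartheta| , \tag{50}$$

where $\tilde{\vartheta}$ denotes the approximate value. With the definition
$\tilde{T} = (\pi/\tilde{\vartheta} - 1)/2$, the fidelity is

$$F = \big| \langle\Psi_{\rm posterior}| A^{\tilde{T}} |\Psi_{\rm prior}\rangle \big| = \sin\Big(\frac{2\tilde{T}+1}{2}\vartheta\Big) . \tag{51}$$

Substituting $\vartheta = \tilde{\vartheta} \pm \Delta\vartheta$ and using the relation $(2\tilde{T}+1)\tilde{\vartheta} = \pi$ we obtain

$$F = \cos\Big(\frac{2\tilde{T}+1}{2}\Delta\vartheta\Big) = \cos\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}} \ge 1 - \frac{1}{2}\Big(\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}}\Big)^2 . \tag{52}$$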
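The fidelity loss caused by an imperfect estimate of $\vartheta$ can be evaluated directly. The sketch below treats the number of iterations as a real number (ignoring the rounding to an integer discussed earlier) and compares the resulting fidelity with the elementary bound $\cos x \ge 1 - x^2/2$. The function names and numerical values are illustrative assumptions.

```python
import math

def fidelity(theta, theta_est):
    """Fidelity sin((2T+1) theta / 2) when T is computed from the estimate theta_est."""
    t_est = (math.pi / theta_est - 1.0) / 2.0
    return math.sin((2.0 * t_est + 1.0) * theta / 2.0)

def fidelity_lower_bound(theta_est, delta):
    """Lower bound 1 - x^2/2 with x = pi * delta / (2 * theta_est)."""
    x = math.pi * delta / (2.0 * theta_est)
    return 1.0 - 0.5 * x * x

theta_est = 0.100            # estimated angle (illustrative)
delta = 0.001                # bound on the absolute error of the estimate
theta_true = theta_est + delta

f = fidelity(theta_true, theta_est)
bound = fidelity_lower_bound(theta_est, delta)
```

Because the argument of the cosine scales as $\Delta\vartheta / \tilde{\vartheta}$, it is the relative error of the estimate that matters: small angles require proportionally more accurate phase estimation.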

Figure 1: The standard phase-estimation circuit applied to the
hypothesis-elimination operator $A$. A measurement of the upper $t$-qubit
register returns the value of $\vartheta$ with an accuracy of $m$ bits and a probability
of success of at least $1 - \epsilon$, where $m$ and $\epsilon$ are related to each other and to
$t$ via the condition $t = m + \lceil \log(2 + 1/2\epsilon) \rceil$. The gates labeled $H^{\otimes t}$ and FT
are the $t$-qubit Hadamard and quantum Fourier transforms, respectively.

3.2 Two-valued models

A straightforward generalization of hypothesis elimination is provided by a
two-valued conditional probability of the form

$$P(d|h) = \begin{cases} a_1 & \text{if } h \in H_d , \\ a_2 & \text{otherwise} , \end{cases} \tag{53}$$



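where $a_1 > a_2$ are constants, and $H_d$ is the set of hypotheses favored by the
data $d$. The suppression coefficient $r = a_1/a_2$ measures how much hypotheses
in $H_d$ are favored by the data. As before, the prior state can be written in
the form of Eq. (43),

$$|\Psi_{\rm prior}\rangle = \sin\frac{\vartheta}{2}\, |\alpha\rangle + \cos\frac{\vartheta}{2}\, |\beta\rangle , \tag{54}$$

and for the posterior state we calculate

$$|\Psi_{\rm posterior}\rangle = \sqrt{a_1} \sin\frac{\vartheta}{2}\, |\alpha\rangle + \sqrt{a_2} \cos\frac{\vartheta}{2}\, |\beta\rangle . \tag{55}$$

Here the common factor $1/\sqrt{P(d)}$ has been absorbed into $\sqrt{a_1}$ and $\sqrt{a_2}$, so
that only the ratio $r$ is fixed by the model. Normalization of the posterior
state implies that

$$a_2 = \frac{1}{r \sin^2(\vartheta/2) + \cos^2(\vartheta/2)} . \tag{56}$$

Defining $\vartheta'$ so that

$$\cos\frac{\vartheta'}{2} = \sqrt{a_2} \cos\frac{\vartheta}{2} = \frac{\cos(\vartheta/2)}{\sqrt{r \sin^2(\vartheta/2) + \cos^2(\vartheta/2)}} , \tag{57}$$

the number of iterations $T$ necessary to transform $|\Psi_{\rm prior}\rangle$ into
$|\Psi_{\rm posterior}\rangle = A^T |\Psi_{\rm prior}\rangle$ can then be calculated as

$$T(\vartheta, r) = (\vartheta'/\vartheta - 1)/2 . \tag{58}$$

It follows that knowledge of $\vartheta$ and the suppression coefficient $r$ is sufficient
for a deterministic implementation of Bayesian updating with the conditional
distribution (53). As before, the value of $\vartheta$ can be obtained using the
algorithm of Figure 1, and the same fidelity bound (52) can be used.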
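The target angle and the iteration count for a two-valued model can be computed from $\vartheta$ and $r$ alone. The sketch below does this and cross-checks the result against the classical posterior probability of the favored set, $rS/(rS + 1 - S)$ with $S = \sin^2(\vartheta/2)$, which should equal $\sin^2(\vartheta'/2)$. The function name `two_valued_iterations` and the numbers are illustrative assumptions.

```python
import math

def two_valued_iterations(theta, r):
    """Return (theta_prime, T) for prior angle theta and suppression coefficient r."""
    s = math.sin(theta / 2.0)
    c = math.cos(theta / 2.0)
    cos_half = min(1.0, c / math.sqrt(r * s * s + c * c))   # cos(theta'/2)
    theta_prime = 2.0 * math.acos(cos_half)
    return theta_prime, (theta_prime / theta - 1.0) / 2.0

# Illustrative numbers: prior probability 0.25 on the favored set, r = 3.
prior_mass = 0.25
r = 3.0
theta = 2.0 * math.asin(math.sqrt(prior_mass))
theta_prime, T = two_valued_iterations(theta, r)

posterior_mass_quantum = math.sin(theta_prime / 2.0) ** 2
posterior_mass_classical = r * prior_mass / (r * prior_mass + 1.0 - prior_mass)
```

For $r = 1$ the data carry no information and $T = 0$. In this example $T = 0.25$ is not an integer, so an exact implementation would need the modified final step with reduced phase shifts mentioned in Sec. 3.1.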



3.3 Bayesian updating: general models 

In this section we show how to generalize the above algorithm to the case of
Bayesian updating with a general model, i.e., a general conditional
distribution $P(d|h)$. The main idea is to represent $P(d|h)$ as a product of two-valued
models with known suppression coefficients. Bayesian updating with $P(d|h)$
can then be viewed as a sequence of Bayesian updatings for the two-valued
models.

Since the posterior depends on $P(d|h)$ only up to a factor independent of $h$,
we may rescale $P(d|h)$ so that $0 \le \log_2 P(d|h) < 1$. Let $c_k(h)$ be the
coefficients in the binary expansion of $\log_2 P(d|h)$,

$$\log_2 P(d|h) = \sum_{k=1}^{\infty} c_k(h)\, 2^{-k} . \tag{59}$$

This allows us to express $P(d|h)$ as a product,

$$P(d|h) = \prod_{k=1}^{\infty} \Big( \sqrt[2^k]{2} \Big)^{c_k(h)} . \tag{60}$$

Let $H_{d_k}$ be the set of hypotheses $\{h\}$ for which $c_k(h) = 1$. The $k$th term
in this product is either $\sqrt[2^k]{2}$ or 1 depending on whether $h$ is in $H_{d_k}$ or not.
Bayesian updating with the conditional probability $P(d|h)$ can therefore be
viewed as a sequence of stages corresponding to the acquisition of data from
the sequence $d_1, d_2, \ldots$. At each stage, an updating step for a two-valued
model as described in the previous section is carried out.
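The decomposition into two-valued stages can be sketched classically. The code below assumes that all values of $P(d|h)$ are positive and differ by less than a factor of 2, so that after dividing by the smallest value the base-2 logarithm lies in $[0, 1)$. This rescaling is an assumption of the sketch and leaves the posterior unchanged. The expansion is truncated after `n_bits` digits, and all names and numbers are illustrative.

```python
import math

def expansion_bits(likelihood, n_bits):
    """Digits c_k(h), k = 1..n_bits, of x(h) = log2(P(d|h) / min_h P(d|h))."""
    smallest = min(likelihood)
    if smallest <= 0.0 or max(likelihood) >= 2.0 * smallest:
        raise ValueError("sketch assumes positive values with max/min < 2")
    bits = []
    for l in likelihood:
        x = math.log2(l / smallest)   # lies in [0, 1)
        digits = []
        for _ in range(n_bits):
            x *= 2.0
            digit = 1 if x >= 1.0 else 0
            x -= digit
            digits.append(digit)
        bits.append(digits)
    return bits

def product_of_two_valued_models(digits):
    """Truncated product over k of (2**(2**-k))**c_k for one hypothesis."""
    value = 1.0
    for k, digit in enumerate(digits, start=1):
        if digit:
            value *= 2.0 ** (2.0 ** -k)
    return value

likelihood = [0.2, 0.3, 0.25, 0.35]
n_bits = 20
bits = expansion_bits(likelihood, n_bits)
rebuilt = [product_of_two_valued_models(d) for d in bits]

# Stage k favors the hypotheses whose k-th digit is 1, with suppression
# coefficient 2**(2**-k) relative to the remaining hypotheses.
stage_sets = [{h for h, d in enumerate(bits) if d[k]} for k in range(n_bits)]
```

The suppression coefficients $2^{2^{-k}}$ approach 1 rapidly, so later stages change the state less and less; truncating after a fixed number of digits is a natural way to trade accuracy for circuit length.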